{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "thousand-market",
   "metadata": {},
   "source": [
    "# K8s Envoy Test with Changing Model Replicas\n",
    "\n",
    "In this test we change the number of replicas of a given model and check whether inference requests are still being served while the change is applied.\n",
    "\n",
    "For now we can expect some requests to fail with HTTP 503 because of the way we currently apply Envoy updates.\n",
    "\n",
    "## Prerequisites\n",
    "\n",
    "- A (KinD) cluster with 3 Triton server replicas.\n",
    "    - One way to set this up is to increase the Triton server `Replicas` to 3 in `k8s/yaml/servers.yaml` and then apply the manifest via `kubectl apply -f k8s/yaml/servers.yaml -n seldon-mesh`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "representative-intersection",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "SCHEDULER_IP=!kubectl get svc seldon-scheduler -n seldon-mesh -o jsonpath='{.status.loadBalancer.ingress[0].ip}'\n",
    "SCHEDULER_IP=SCHEDULER_IP[0]\n",
    "os.environ[\"SCHEDULER_IP\"] = SCHEDULER_IP  # export so the %%bash cells below can read it"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "delayed-resort",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "MESH_IP=!kubectl get svc seldon-mesh -n seldon-mesh -o jsonpath='{.status.loadBalancer.ingress[0].ip}'\n",
    "MESH_IP=MESH_IP[0]\n",
    "os.environ[\"MESH_IP\"] = MESH_IP  # export so the %%bash cells below can read it"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "described-girlfriend",
   "metadata": {},
   "source": [
    "## Deploy a single-replica `tfsimple` model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 110,
   "id": "asian-roller",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \n",
      "}\n"
     ]
    }
   ],
   "source": [
    "!grpcurl -d '{\"model\":{ \\\n",
    "              \"meta\":{\"name\":\"tfsimple\"},\\\n",
    "              \"modelSpec\":{\"uri\":\"gs://seldon-models/triton/simple\",\\\n",
    "                           \"requirements\":[\"tensorflow\"],\\\n",
    "                           \"memoryBytes\":500},\\\n",
    "              \"deploymentSpec\":{\"replicas\":1}}}' \\\n",
    "         -plaintext \\\n",
    "         -import-path ../../apis \\\n",
    "         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/LoadModel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 111,
   "id": "waiting-accordance",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{}\n"
     ]
    }
   ],
   "source": [
    "!seldon model status tfsimple -w ModelAvailable --scheduler-host \"$SCHEDULER_IP:9004\" | jq -M ."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "ec673015",
   "metadata": {},
   "source": [
    "## Scaling up"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 118,
   "id": "bfcefb5b",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Increasing replica count\n",
      "Done\n",
      "{\n",
      "  \n",
      "}\n"
     ]
    }
   ],
   "source": [
    "%%bash\n",
    "for i in {1..1000}; \n",
    "do\n",
    "\n",
    "# Send one inference request in the background and print any non-200 status code\n",
    "url=http://${MESH_IP}/v2/models/tfsimple/infer \n",
    "ret=`curl -s -o /dev/null -w \"%{http_code}\" \"${url}\" -H \"Content-Type: application/json\" \\\n",
    "        -d '{\"inputs\":[{\"name\":\"INPUT0\",\"data\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],\"datatype\":\"INT32\",\"shape\":[1,16]},{\"name\":\"INPUT1\",\"data\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],\"datatype\":\"INT32\",\"shape\":[1,16]}]}'` \\\n",
    "&& if [ $ret -ne 200 ]; then echo \"Failed with code ${ret}\"; fi &\n",
    "\n",
    "if [[ $i -eq 500 ]]; then\n",
    "    echo \"Increasing replica count\"\n",
    "    grpcurl -d '{\"model\":{ \n",
    "              \"meta\":{\"name\":\"tfsimple\"},\n",
    "              \"modelSpec\":{\"uri\":\"gs://seldon-models/triton/simple\",\n",
    "                           \"requirements\":[\"tensorflow\"],\n",
    "                           \"memoryBytes\":500},\n",
    "              \"deploymentSpec\":{\"replicas\":3}}}' \\\n",
    "         -plaintext \\\n",
    "         -import-path ../../apis \\\n",
    "         -proto ../../apis/mlops/scheduler/scheduler.proto  $SCHEDULER_IP:9004 seldon.mlops.scheduler.Scheduler/LoadModel &\n",
    "fi\n",
    "\n",
    "done\n",
    "echo \"Done\""
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "id": "a05b0546",
   "metadata": {},
   "source": [
    "## Scaling down"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 119,
   "id": "e6319047",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Decrease replica count\n",
      "{\n",
      "  \n",
      "}\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000503\n",
      "Failed with code 000404\n",
      "Failed with code 000404\n",
      "Failed with code 000404\n",
      "Failed with code 000404\n",
      "Failed with code 000404\n",
      "Done\n",
      "Failed with code 000404\n",
      "Failed with code 000503\n",
      "Failed with code 000404\n",
      "Failed with code 000404\n",
      "Failed with code 000404\n"
     ]
    }
   ],
   "source": [
    "%%bash\n",
    "for i in {1..1000}; \n",
    "do\n",
    "\n",
    "# Send one inference request in the background and print any non-200 status code\n",
    "url=http://${MESH_IP}/v2/models/tfsimple/infer \n",
    "ret=`curl -s -o /dev/null -w \"%{http_code}\" \"${url}\" -H \"Content-Type: application/json\" \\\n",
    "        -d '{\"inputs\":[{\"name\":\"INPUT0\",\"data\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],\"datatype\":\"INT32\",\"shape\":[1,16]},{\"name\":\"INPUT1\",\"data\":[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16],\"datatype\":\"INT32\",\"shape\":[1,16]}]}'` \\\n",
    "&& if [ $ret -ne 200 ]; then echo \"Failed with code ${ret}\"; fi &\n",
    "\n",
    "if [[ $i -eq 500 ]]; then\n",
    "    echo \"Decrease replica count\"\n",
    "    grpcurl -d '{\"model\":{ \n",
    "              \"meta\":{\"name\":\"tfsimple\"},\n",
    "              \"modelSpec\":{\"uri\":\"gs://seldon-models/triton/simple\",\n",
    "                           \"requirements\":[\"tensorflow\"],\n",
    "                           \"memoryBytes\":500},\n",
    "              \"deploymentSpec\":{\"replicas\":1}}}' \\\n",
    "         -plaintext \\\n",
    "         -import-path ../../apis \\\n",
    "         -proto ../../apis/mlops/scheduler/scheduler.proto  $SCHEDULER_IP:9004 seldon.mlops.scheduler.Scheduler/LoadModel &\n",
    "fi\n",
    "\n",
    "done\n",
    "echo \"Done\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 109,
   "id": "3a081743",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \n",
      "}\n"
     ]
    }
   ],
   "source": [
    "!grpcurl -d '{\"model\": {\"name\" : \"tfsimple\"}}' \\\n",
    "         -plaintext \\\n",
    "         -import-path ../../apis/ \\\n",
    "         -proto ../../apis/mlops/scheduler/scheduler.proto  ${SCHEDULER_IP}:9004 seldon.mlops.scheduler.Scheduler/UnloadModel"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4dbd697a",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "seldon-core-v2-python-3.8",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  },
  "vscode": {
   "interpreter": {
    "hash": "dc8b7b3d3b143757bd161269cf4ea949ead8c2ebc12155bae9e3cf57923f4a90"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
